ABSTRACT

Cervical cancer is the second most common cancer in women worldwide,
and occurs when abnormal cells are present in the cervix, which [...]
and known for its high accuracy value. Moreover, there is a support vector [...]
functions as classifier for the categorization of cervical cancer.

1. INTRODUCTION

Many countries rank cancer as the second most common health issue, and cervical cancer
causes most of the deaths recorded [1]. Cervical cancer occurs when abnormal cells in the cervix
continue to grow uncontrollably and result in benign tumors, which later develop into cervical
cancer cells that spread to other parts of the body [2].

This cancer is one of the most common diseases in women throughout the world, with nearly 500,000
women developing the disease each year, and is ranked the fourth most common malignant disease
worldwide [3], [4]. Early cervical cancer and pre-cancer typically cause no symptoms until a tumor
has formed. Most cases are recorded in less developed countries that lack effective screening
systems [4].

Almost all cases are caused by human papillomavirus (HPV), and the risk factors include exposure
to smoking and immune-system dysfunction [4]. There are more than one hundred types of HPV; however,
infection with one of about 15 carcinogenic HPV genotypes is very common among young women after
their first sexual activity [5]. Although carcinogenic risk is linked to evolutionary species, each
genotype acts as an independent infection [5].

In women's bodies, this virus produces two proteins, namely E6 and E7. Both are
dangerous, since they deactivate certain genes that play a crucial role in stopping tumor development.
These two proteins also aggressively trigger the growth of cells in the uterine wall. This abnormal cell
growth eventually causes gene changes or mutations, which then become the cause of the cervical cancer
that develops in the body. The symptoms that characterize the disease include unusual bleeding from the
vagina, irregular menstrual cycles, pain in the hip, low back pain, body weakness and tiredness, weight
loss when not on a diet, loss of appetite, abnormal vaginal fluid, and leg inflammation.

Cervical cancers are mostly associated with low- and middle-income countries, where approximately
90% of cases occur and where HPV vaccination programs and organized screening are lacking [6]. The
treatment depends on the stage of the disease, the available resources, and whether the diagnosis is
made in the early stages. Fertility-preserving surgical procedures have become the standard of care
for women with low-risk disease. The overall prognosis remains poor for women with metastatic or
recurrent disease. The period of survival is less than 12 months; however, the incorporation of an
anti-vascular endothelial growth factor (VEGF) agent has been able to extend it.


2. RESEARCH METHOD 
2.1. Convolutional neural network 

A convolutional neural network (CNN) is a type of deep neural network developed from the multilayer
perceptron (MLP) [7], [8]. CNNs differ from MLPs in that they are suited to detecting and recognizing
objects in images. CNNs give better results than ordinary neural networks (NNs) because of one
additional layer, known as the convolutional layer, which consists of neurons with activation
functions, biases, and weights [7]. A CNN is divided into two main parts: feature extraction and the
fully-connected layer [8], [9]. An illustration of a CNN is shown in Figure 1 [10].


2.1.1. Feature extraction layer 

The feature extraction layer "encodes" an image into features that represent the object it contains
(feature extraction) [7]. A CNN is technically an architecture with several stages, in which the input
and output of each stage are sets of arrays called feature maps. The feature extraction layer itself
comprises two parts, convolution and pooling, whose output is passed to the MLP layers, as follows
[11]-[13].
— Convolutional layer 

The convolutional layer is the main building block of a convolutional neural network (CNN). This layer
transforms the input into a form that is easier to process by passing it through a filter (kernel)
of fixed size without losing essential features [14]. The filters slide across the entire input, and
each unit receives its input from the previous layer. The feature map for each filter is generated by
shifting the filter over the input and computing the sum of dot products at every position.
— Pooling 

Pooling is a technique for reducing dimensions, with two common approaches, namely average and
maximum pooling [15]. The operation is called max pooling when it keeps the highest value in each
window, while average pooling keeps the mean value. After this, the flattening process takes place:
the pooled structure is reshaped into a one-dimensional vector, which is then fed into a
fully-connected neural network or MLP for classification [14].
— MLP layers 

The MLP layers form a fully connected multi-layer perceptron that performs the classification.
An MLP has three kinds of layers, namely the input, hidden, and output layers. The activation function
is the rectified linear unit (ReLU), which is popular in deep learning because of its simplicity.
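The convolution, pooling, and flattening steps described above can be sketched in a few lines. This is a minimal illustration using plain Python lists rather than a deep learning library; the 5x5 input, the 2x2 filter, and the function names `conv2d`, `pool2d`, and `flatten` are hypothetical and are not taken from the study.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and take the
    sum of element-wise products at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def pool2d(fmap, size=2, mode="max"):
    """Non-overlapping pooling; rows/columns that do not fill a window are dropped."""
    if mode == "max":
        reduce = max
    elif mode == "avg":
        reduce = lambda window: sum(window) / len(window)
    else:
        raise ValueError("mode must be 'max' or 'avg'")
    rows, cols = len(fmap) // size, len(fmap[0]) // size
    return [
        [
            reduce([fmap[r * size + i][c * size + j]
                    for i in range(size) for j in range(size)])
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def flatten(fmap):
    """Reshape a 2-D feature map into a one-dimensional vector."""
    return [value for row in fmap for value in row]


# Hypothetical 5x5 input and 2x2 filter, chosen only for illustration.
image = [
    [1, 2, 0, 1, 3],
    [0, 1, 2, 3, 1],
    [1, 0, 1, 2, 2],
    [2, 1, 0, 1, 0],
    [0, 3, 1, 0, 1],
]
kernel = [[1, 0],
          [0, -1]]

feature_map = conv2d(image, kernel)           # 4x4 feature map
max_pooled = pool2d(feature_map, 2, "max")    # 2x2 after max pooling
avg_pooled = pool2d(feature_map, 2, "avg")    # 2x2 after average pooling
features = flatten(max_pooled)                # vector passed on to the MLP
```

Max pooling keeps the strongest response in each window, whereas average pooling smooths it; either result is flattened in the same way before classification.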


2.1.2. Fully-connected layer 

The fully-connected layer operates on the output of the feature extraction layer, a multidimensional
array that is flattened (reshaped) into a feature vector [16], [17]. All neurons of the previous
layer are linked to the next layer, as in ordinary neural networks. Therefore, in order to connect
properly, each activation of the previous layer has to be converted into 1-D data. This part is
usually referred to as an MLP, which processes the data to produce the classification [18]. The
contrast with convolutional layers is that neurons in a convolutional layer are connected only to a
specific input area, while neurons in a fully-connected layer are connected to all inputs. However,
both perform "dot product" operations; therefore, their functions are not significantly different.
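As a minimal sketch of the dot-product view of a fully-connected layer, the following example passes a flattened feature vector through one hidden layer with ReLU and one output neuron. The weights, biases, and the names `dense` and `relu` are hypothetical values chosen for illustration, not parameters from the study.

```python
def relu(values):
    """Rectified linear unit: negative activations are set to zero."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Fully-connected layer: every output neuron takes the dot product of
    ALL inputs with its own weight row, then adds its bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


# Flattened 1-D feature vector (hypothetical values).
flat = [0.0, 1.0, 0.0, 2.0]

# Hypothetical parameters: 4 inputs -> 3 hidden neurons -> 1 output neuron.
W_hidden = [[0.5, -1.0, 0.0, 1.0],
            [1.0, 1.0, 1.0, -1.0],
            [0.0, 0.5, 0.0, 0.5]]
b_hidden = [0.0, 0.5, -1.0]
W_out = [[1.0, -1.0, 2.0]]
b_out = [-0.5]

hidden = relu(dense(flat, W_hidden, b_hidden))
score = dense(hidden, W_out, b_out)[0]
```

A convolutional layer computes the same kind of weighted sum, but each neuron only sees a small window of the input and shares its weights across positions.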


2.2. Support vector machine 

Support vector machine (SVM) has received much attention in the field of classification [19]. The
SVM algorithm was developed on the basis of statistical learning theory [20]. SVM is also known as an
effective machine learning method with high classification efficiency [20]. An illustration of SVM
is shown in Figure 2 [21].

Figure 1. Illustration of convolutional neural network

Figure 2. Illustration of support vector machine


SVM is a machine learning algorithm for classification and regression that was introduced by
Vapnik (1990) [22]. Nello Cristianini then studied SVM further on the basis of Vapnik's results [23].
Subsequently, Bernhard Schölkopf developed SVM theory and kernel functions [24]. SVM was originally
formulated for binary classification; however, it is also used for multiclass categorization.

SVM maps the input into a higher-dimensional space to support nonlinear classification and
constructs the maximal-margin separating hyperplane. For instance, consider a set of firms
represented by the values of their ratios $\{x_i\}$, $i = 1, \dots, N$, and a set of associated labels
$y_i \in \{-1, +1\}$ that describe each result as failed or healthy.

The main purpose of SVM is to find the best hyperplane, written as

$$w \cdot x + b = 0 \tag{1}$$

The best hyperplane (1) is the one that maximizes the margin.
The optimization problem of SVM is summarized as follows:

$$\text{minimize} \quad \frac{1}{2}\|w\|^2 \tag{2}$$

subject to

$$y_i (w^T \cdot x_i + b) \geq 1, \quad \forall i = 1, \dots, N \tag{3}$$

Problem (2) finds $w \in \mathbb{R}^n$ (weights) and $b \in \mathbb{R}$ (bias) subject to the
constraints in (3). The problem in (2) is a quadratic optimization problem.

Therefore, a Lagrange multiplier $\alpha_i$ is introduced for each of the constraints in (3), giving
the function

$$L(w, b, \alpha) = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{N} \alpha_i \left[ y_i (w \cdot x_i + b) - 1 \right] \tag{4}$$

where $\alpha = (\alpha_1, \alpha_2, \dots, \alpha_N)^T$.
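Setting the derivatives of $L(w, b, \alpha)$ with respect to $w$ and $b$ equal to zero, the
equations obtained are

$$\frac{\partial L}{\partial w} = w - \sum_{i=1}^{N} \alpha_i y_i x_i = 0 \;\Rightarrow\; w = \sum_{i=1}^{N} \alpha_i y_i x_i \tag{5}$$

$$\frac{\partial L}{\partial b} = -\sum_{i=1}^{N} \alpha_i y_i = 0 \;\Rightarrow\; \sum_{i=1}^{N} \alpha_i y_i = 0 \tag{6}$$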


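Then, eliminating $w$ and $b$ from $L(w, b, \alpha)$ using (5) and (6) gives the dual form

$$L(\alpha) = \max \left\{ -\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} y_i y_j \alpha_i \alpha_j (x_i \cdot x_j) + \sum_{i=1}^{N} \alpha_i \right\} \tag{7}$$

subject to

$$\sum_{i=1}^{N} y_i \alpha_i = 0, \quad \alpha_i \geq 0 \tag{8}$$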


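From (1), with the decision function $f(x) = w \cdot x + b$, the values of $w$ and $b$ are finally
obtained as follows:

$$w = \sum_{i=1}^{N} \alpha_i y_i x_i \tag{9}$$

$$b = \frac{1}{N_S} \sum_{i \in S} \left( y_i - \sum_{m \in S} \alpha_m y_m \, x_m \cdot x_i \right) \tag{10}$$

where $S$ denotes the set of support vectors (the samples with $\alpha_i > 0$) and $N_S$ is their number.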
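To make the relationship between the dual variables and the hyperplane concrete, the sketch below recovers $w$ and $b$ from a known dual solution on a two-point toy problem, reading the bias as the average of $y_i - w \cdot x_i$ over the support vectors. The data and the multipliers `alpha` are hypothetical: they were worked out by hand for this toy case and are not taken from the study. The example also checks the margin constraints and that the primal and dual objectives coincide.

```python
def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


# Toy data (hypothetical): one sample per class.
X = [(2.0, 2.0), (0.0, 0.0)]
y = [+1, -1]

# Dual solution for this toy problem, derived by hand (assumption).
alpha = [0.25, 0.25]
n = len(X)

# Equality constraint of the dual: sum_i alpha_i * y_i = 0.
balance = sum(a * yi for a, yi in zip(alpha, y))

# Weight vector: w = sum_i alpha_i * y_i * x_i.
w = [sum(alpha[i] * y[i] * X[i][k] for i in range(n)) for k in range(len(X[0]))]

# Bias: average of y_i - w . x_i over the support vectors (alpha_i > 0).
support = [i for i in range(n) if alpha[i] > 0]
b = sum(y[i] - dot(w, X[i]) for i in support) / len(support)

# Margin constraints y_i (w . x_i + b) >= 1; support vectors sit exactly at 1.
margins = [y[i] * (dot(w, X[i]) + b) for i in range(n)]

# Primal objective 0.5 * ||w||^2 and dual objective should coincide.
primal = 0.5 * dot(w, w)
dual = sum(alpha) - 0.5 * sum(
    y[i] * y[j] * alpha[i] * alpha[j] * dot(X[i], X[j])
    for i in range(n) for j in range(n)
)
```

In practice the multipliers come from a quadratic programming solver; only the final step of turning them into a hyperplane is shown here.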


In this study, kernel functions are used with the support vector machine (SVM) [25]. A kernel
function addresses problems that are not linearly separable in the input space and allows the
algorithm to be expressed in terms of the inner product between two vectors [25]. Several kernel
functions and their parameters are listed in Table 1.


Table 1. Kernel function

Name                                    Kernel function
Linear                                  $K(x_i, x_j) = x_i^T x_j$
Polynomial                              $K(x_i, x_j) = (t + x_i^T x_j)^d$
Gaussian radial basis function (RBF)    $K(x_i, x_j) = \exp(-\|x_i - x_j\|^2 / \sigma^2)$
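The three kernels in Table 1 can be written as small functions. The sketch below assumes the polynomial kernel has the form $(t + x_i^T x_j)^d$ and the RBF kernel the form $\exp(-\|x_i - x_j\|^2 / \sigma^2)$; the default values of `t`, `d`, and `sigma` and the helper `gram_matrix` are hypothetical and are not settings reported in the study.

```python
import math


def linear_kernel(xi, xj):
    """Inner product of the two input vectors."""
    return sum(a * b for a, b in zip(xi, xj))


def polynomial_kernel(xi, xj, t=1.0, d=2):
    """Shifted inner product raised to the power d (t and d are assumed defaults)."""
    return (t + linear_kernel(xi, xj)) ** d


def rbf_kernel(xi, xj, sigma=1.0):
    """Gaussian RBF on the squared Euclidean distance (sigma is an assumed default)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-squared_distance / sigma ** 2)


def gram_matrix(samples, kernel):
    """Matrix of pairwise kernel values, the form in which an SVM uses a kernel."""
    return [[kernel(a, b) for b in samples] for a in samples]


x1, x2 = (1.0, 2.0), (3.0, 0.0)
k_lin = linear_kernel(x1, x2)        # 1*3 + 2*0
k_poly = polynomial_kernel(x1, x2)   # (1 + 3) ** 2
k_rbf = rbf_kernel(x1, x2)           # exp(-8) since ||x1 - x2||^2 = 8
K = gram_matrix([x1, x2], rbf_kernel)
```

Replacing every inner product $x_i \cdot x_j$ in the dual problem with one of these functions is what allows the same SVM formulation to produce nonlinear decision boundaries.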





2.3. Confusion matrix 

Accuracy is one of the main parameters used to assess the success of a classification. It refers to
the percentage of correct answers at the testing stage, and a confusion matrix is used to measure it.
The confusion matrix used is shown in Table 2 [25].

The formula of accuracy is written as:

$$\text{accuracy} = \frac{T_P + T_N}{T_P + T_N + F_P + F_N} \tag{11}$$

$T_P$: Number of samples having cervical cancer that are classified correctly.
$F_P$: Number of healthy individuals that are incorrectly classified as having cervical cancer.
$F_N$: Number of samples with cervical cancer that are incorrectly classified as healthy.
$T_N$: Number of healthy individuals that are correctly classified.


Table 2. Confusion matrix

                          Prediction
Actual          Positive        Negative
Positive        $T_P$           $F_N$
Negative        $F_P$           $T_N$
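As a short worked example of the accuracy formula (11), the sketch below counts the four confusion-matrix entries from paired actual and predicted labels and computes the accuracy. The eight labels and the function names `confusion_counts` and `accuracy` are hypothetical and serve only to illustrate the arithmetic.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1   # cancer, classified as cancer
        elif p == positive and a != positive:
            fp += 1   # healthy, classified as cancer
        elif p != positive and a == positive:
            fn += 1   # cancer, classified as healthy
        else:
            tn += 1   # healthy, classified as healthy
    return tp, fp, fn, tn


def accuracy(tp, tn, fp, fn):
    """Share of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical labels: 1 = cervical cancer, 0 = healthy.
actual    = [1, 0, 0, 1, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 0, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(actual, predicted)
acc = accuracy(tp, tn, fp, fn)   # (2 + 4) / 8
```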


3. RESULTS AND ANALYSIS 
3.1. Data 

This study used a database of cervical cancer patients consisting of 652 records, of which 607
belong to the majority class and 45 to the minority class. The minority class, labeled '1', indicates
the presence of cervical cancer, while the majority class, labeled '0', indicates its absence.
There were 25 features used in this study, namely age, number of sexual
partners, first sexual intercourse, number of pregnancies, smokes (years, packs/year), hormonal
contraceptives (years), intrauterine device (years), sexually transmitted diseases (STD) (number,
condylomatosis, vulvo-perineal condylomatosis, syphilis, human immunodeficiency virus (HIV), number of
diagnoses), diagnosis (cancer, human papillomavirus (HPV)), Hinselmann, Schiller, and cytology.
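With only 45 positive records out of 652, the way the data are divided matters. The sketch below shows one possible way to split the labels so that both parts keep the same class ratio, using the 20% training fraction stated in Section 3.2. The source does not say how the split was drawn, so the stratification, the seed, and the function name `stratified_split` are assumptions made for illustration.

```python
import random


def stratified_split(labels, train_fraction, seed=0):
    """Split sample indices so that each class contributes the same
    fraction of its members to the training set."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)


# Class counts from the text: 607 records labeled '0' and 45 labeled '1'.
labels = [0] * 607 + [1] * 45

train_idx, test_idx = stratified_split(labels, train_fraction=0.20)

train_positive = sum(labels[i] for i in train_idx)
test_positive = sum(labels[i] for i in test_idx)
```

Under these assumptions the training part holds 130 records, 9 of them positive, and the testing part holds 522 records, 36 of them positive.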


3.2. Results 

For the classification method, this research used 20% of the data for training and 80% for testing.
The convolutional neural network was trained for 1,000 epochs and combined with several kernel
functions for the support vector machine. The results are shown in Figures 3, 4, and 5.
Figure 3 (a) shows that the accuracy of the model rises as the number of epochs increases. The blue
line, which stands for the training data, reached the higher accuracy of 100%, while the orange line,
for the testing data, reached an accuracy of 93.67%. Figure 3 (b) shows that the loss (error)
decreases as the number of epochs increases. The error on the training data was 0, while the error on
the test data was 0.06.

Figure 4 (a) shows that the accuracy of the model increases as the number of epochs increases, with
the blue line (training data) giving higher accuracy than the orange line (testing data). The accuracy
on the training data was 100%, while on the testing data it was 92.72%. Figure 4 (b) shows that the
loss (error) decreases as the number of epochs increases. The error on the training data was 0, while
on the testing data it was 0.07.

Figure 5 (a) shows that the accuracy of the model increases as the number of epochs increases, and
that the training data (blue line) gave higher accuracy than the testing data (orange line). The
accuracy on the training data was 100%, while on the testing data it was 92.91%. In addition,
Figure 5 (b) shows that the loss (error) decreases as the number of epochs increases; it was 0 on the
training data and 0.07 on the test data. Table 3 shows the accuracy of each method.

In the comparison, the convolutional neural network-support vector machine with each of the kernels
predicted the data well in the classification of cervical cancer. The results show that the
convolutional neural network-support vector machine with linear kernel had the best accuracy, 93.67%,
on the test data. On the training data, all methods reached the same accuracy of 100%. Therefore,
the best method for the classification of cervical cancer is the convolutional neural
network-support vector machine with linear kernel.

Figure 3. Convolutional neural network-support vector machine with linear kernel; (a) model accuracy of
CNN-SVM linear kernel, and (b) model loss of CNN-SVM linear kernel


[Plot: panels "model accuracy" and "model loss"; train and test curves; x-axis: epoch, 0 to 1000]

Figure 4. Convolutional neural network-support vector machine with polynomial kernel; (a) model accuracy
of CNN-SVM polynomial kernel and (b) model loss of CNN-SVM polynomial kernel

[Plot: panels "model accuracy" and "model loss"; train and test curves; x-axis: epoch, 0 to 1000]

Figure 5. Convolutional neural network-support vector machine with Gaussian RBF kernel; (a) model
accuracy of CNN-SVM Gaussian RBF kernel, and (b) model loss of CNN-SVM Gaussian RBF kernel

Table 3. The accuracy of each method

No    Model                   Training accuracy    Testing accuracy
1     CNN-SVM Linear          100%                 93.67%
2     CNN-SVM Polynomial      100%                 92.72%
3     CNN-SVM RBF             100%                 92.91%


4. CONCLUSION 

Predicting the presence of disease with machine learning methods helps medical staff to
classify ailments. Early detection of disease is important, since it allows the patient to receive
prompt and appropriate treatment, which increases the chance of survival and reduces the health risk.
This research focused on cervical cancer, a common health problem, using 652 records and 25
features. The method used was a combination of a convolutional neural network and a support vector
machine with several kernel functions as the classifier. The experimental results showed that the
methods predicted the data well. Based on the findings, the convolutional neural network-support
vector machine with linear kernel is the best model for the classification of cervical cancer data,
as shown in Table 3. In future research, this method can be developed further to give higher accuracy
and applied to a larger database, in order to give better results for predicting and classifying
different diseases.